1 Introduction

Human safety in buildings has become an increasingly critical issue, especially during emergencies, such as fires, earthquakes, and terrorist attacks (Guha-Sapir et al. 2012). In these circumstances, occupant behavior is one of the most critical determinants of occupant safety (Kobes et al. 2010). A recent systematic review of human behavior under building emergencies (Lin et al. 2020) shows that emergency evacuations were the primary behavior and have been researched extensively since the 1950s and especially after the World Trade Center (WTC) attack in 2001 (Gershon et al. 2012).

According to Abdelgawad and Abdulhai (2009), an emergency evacuation can be understood as the movement of people from a hazardous area to safe destinations. For the estimated time-dependent features of an emergency evacuation (Kobes et al. 2010), occupants’ behavior is considered in terms of two main categories of processes: (1) the pre-evacuation period and (2) the movement period. The second phase (i.e., movement period) has been widely studied by researchers (Lin et al. 2020), with extensive studies focusing mainly on occupants’ primary goal in building emergencies, that is, how to arrive at targeted destinations safely in the shortest possible time, and wayfinding is the primary human behavior that occurs (Vilar et al. 2014). Relatively little attention has been devoted to the pre-evacuation period (Lin et al. 2020; Liu and Lo 2011).

Studying human behavior in building emergencies is not easy. In particular, experimental research is challenging to conduct (Galea et al. 2017; Lin et al. 2020) because of ethical and safety issues (e.g., putting participants in danger), enormous time and financial costs, and the difficulty of balancing ecological validity against experimental control (Meng and Zhang 2014; Duarte et al. 2014). Also, the vast majority of the related studies are qualitatively oriented (Liu and Lo 2011; Zhao et al. 2009; Kinateder et al. 2014). Quantitative data related to the pre-evacuation period are acquired through literature reviews of previous reports on building fires (Lin et al. 2020; Tong and Canter 1985). The assessment of human behavior during pre-evacuation is mainly done after a fire has occurred and is subject to the emotional burden of recalling a traumatic incident and to the malleability of memory, both of which can affect survivors’ accounts (Arias et al. 2019).

Additionally, the T-3 pattern alarm, as defined by the up-to-date ISO 8201: 2017 “Alarm systems—Audible Emergency Evacuation Signal,” is expected to become the universal standard of emergency evacuation signals, meaning that an emergency (e.g., fire, gas leaks, explosion, and nuclear radiation) requires immediate evacuation (see details in the original ISO 8201: 2017). However, previous studies suggested that it is unrealistic to expect that occupants will immediately start evacuation upon hearing such a signal (Liu and Lo 2011; Proulx and Laroche 2003), as Proulx and Laroche measured the efficacy of the T-3 alarm, and results showed that participants could barely identify the alarm as an evacuation signal, and it was judged as a signal not conveying urgency.

In this context, there is a need to find alternative methods for quantitative development regarding human behavior in building emergencies and investigate the inefficiency of the contemporary evacuation alarm, mainly considering the pre-evacuation period.

1.1 Emergency behavior in pre-evacuation period

The pre-evacuation period time starts from the beginning of fire to the moment when a person decides to attempt to evacuate the building (Fahy and Proulx 2001), and it has been described as the delay time because it lasts from a few seconds to hours, which can significantly affect total evacuation time/efficiency (Lin et al. 2020; Kobes et al. 2010).

Based on the fire emergency timeline (Lin et al. 2020), the pre-evacuation period further contains two parts related to human behaviors: (i) the information-seeking phase, followed by (ii) a preparation phase. The information-seeking phase starts with an alarm or cue and ends with the occupants’ first response to stop the pre-emergency activities, e.g., stopping machinery, shopping, watching football (Gershon et al. 2012; Zhao et al. 2009). However, a prior study found that 65% of the people who had experienced building emergencies continued their pre-emergency activities before starting to evacuate (Averill et al. 2013; Lindell and Perry 2012).

People are not in buildings with fire uppermost in their minds, but for other purposes to which they have a considerable commitment that may continue guiding their behavior even when an emergency occurs (Averill et al. 2013; Purser and Bensilum 2001). Particularly after they have invested a significant amount of time, money, or effort in their activity, they will be more reluctant to disengage from that task in response to an alarm signal if that means that they must give up their work (Gershon et al. 2012). Different pre-emergency activities have been documented in papers and reports (Lin et al. 2020; Donald and Canter 1992). For example, in the WTC evacuation study (Gershon et al. 2012), people who engaged in two or more delaying behaviors were over three times more likely to be delayed in initiating their evacuation. Moreover, researchers indicated that only when these important activities were completed did occupants move to the next phase and attempt to evacuate (Purser and Bensilum 2001).

Empirical evidence has also indicated that pre-emergency activities may significantly affect people’s emergency responses to other circumstances, especially the evacuation signal (Bruck and Ball 2007). When individuals are involved in or committed to (engaged in) certain pre-emergency activities, they become too focused on their primary activity to notice their surroundings because of the narrowed perception caused by high cognitive workload (Wogalter 2006; Chandler and Sweller 1991).

Even if people successfully perceived the alarm and ceased the pre-emergency activities, there are plenty of other things people do in the following preparation phase (e.g., collecting personal belongings and gathering family members) than just heading directly to the exits (movement period) (Kuligowski and Hoskins 2010; Fahy and Proulx 2001).

1.2 Virtual reality (VR) in emergency behavior research

Conducting drills and experiments that can evoke realistic evacuation behaviors in building emergencies is difficult (Kinateder et al. 2014). Compared with existing approaches, immersive VR techniques allow the creation of low-risk, cost-efficient, and highly controllable virtual environments (VEs) for conducting virtual evacuation experiments (Vilar et al. 2014).

For instance, evacuees’ decision-making processes, such as route choice (Vilar et al. 2014; Kinateder et al. 2014; Meng and Zhang 2014), waiting time (Andrée et al. 2016), and helping behavior (Gamberini et al. 2015), were investigated in VR-based experiments. The impact factors of evacuee behaviors, such as social impact factors (Kinateder and Warren 2016; Kinateder et al. 2014), environmental impact factors (Duarte et al. 2014; Ronchi et al. 2016), and emergency evacuation management (Wang et al. 2014; Smith and Trenholme 2009), were also studied in experiments with VR.

Notably, researchers have validated the use of VR for examining pre-evacuation behavior under fire (Arias et al. 2019; Kinateder and Warren 2016; Bourhim and Cherkaoui 2020). Arias and colleagues (2019), for example, used the VR approach to replicate findings (pre-evacuation behaviors) from the “classic” incident—MGM Grand fire. However, they targeted their research to the validation of VR in the forensic investigation area and limited their research scope to human behaviors only when participants were trapped inside the hotel room, without considering the influence of pre-emergency activity.

Considering the increasing use of VR-based methodology in investigating human behavior during building fire emergencies (Vilar et al. 2014; Kinateder et al. 2014), the study of critical factors that could influence emergency behaviors, mainly those that occur before people decide to evacuate (pre-evacuation behaviors), such as the engagement level in tasks, can provide insights about how to improve pre-evacuation efficiency, increasing building safety.

In this context, the main objective of this research is to verify the use of VR-based methodology as a helpful way for studying human behaviors during the pre-evacuation period of building emergencies. The specific objectives were (1) to evaluate human behavior compliance with the ISO-type evacuation alarms, as behavioral compliance is the most important and practical measurement for warning efficiency (Kobes et al. 2010); (2) to investigate the influence of pre-emergency activity on evacuation efficiency; and (3) to provide a method/tool for studying the future generation of alarms.

For this, a virtual emergency evacuation experiment is conducted, mainly considering two engagement levels on pre-evacuation activities (i.e., low and high), which refers to the competing tasks one is doing (e.g., the more tasks he/she is doing at the same time, the more engaged he/she is). The adequacy of the VR-based methodology was determined by examining whether the setup would be sufficiently sensitive to detect differences between manipulated engagement conditions and produce human behaviors similar to real situations documented in the literature. A hypothesis was formulated for the main objective: The level of task engagement will influence human behavioral compliance with evacuation alarms during the pre-evacuation period. Information in this work will be helpful for researchers that use VR to study human behaviors under dangerous situations, ultimately contributing to the improvements of human–media interaction.

2 Methodology

An experiment focused on the influence of pre-emergency activity on the effectiveness of ISO-type evacuation alarm was carried out. Thus, immersive VR was used as an interaction environment in which the study’s hypothesis was tested according to the following design.

2.1 Experiment design

Two experimental conditions were considered: (1) pre-emergency high engagement (PHE) condition—participants had a task to accomplish in the narrative; and (2) pre-emergency low engagement (PLE) condition—participants had to wait passively in the narrative.

The study’s independent variable was the engagement level on the pre-emergency activity (low and high). The dependent variables were (a) behavioral compliance with the fire alarm, that is, whether participants followed the alarm and evacuated from the building (complied) or ignored the alarm and remained in the building (not complied), and (b) the patterns of evacuation behavior after the alarm. The study’s control variables were simulator sickness, sense of presence, and usability.

2.2 Participants

Participants from a company were selected to make the sample more representative of the real situation. Sixty workers from a Chinese medical equipment company volunteered for this study, with thirty randomly assigned to each condition. The sample size of each group was determined based on previous studies (Duarte et al. 2014; Vilar et al. 2013). For the PHE condition, of 30 participants aged between 20 and 34 years old (mean = 27; SD = 4.3), 15 were males. For the PLE condition, of 30 participants aged between 19 and 36 years old (mean = 28.5; SD = 4.5), 15 were males.
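The balanced random assignment described above can be illustrated with a short sketch. The source does not say how the randomization was carried out, so the function name `assign_conditions`, the shuffle-and-deal approach, and the `seed` parameter are all assumptions made for illustration.

```python
import random


def assign_conditions(participant_ids, conditions=("PHE", "PLE"), seed=None):
    """Shuffle participants and deal them out so each condition gets an equal share.

    Hypothetical helper: the study only states that thirty participants
    were randomly assigned to each of the two conditions.
    """
    ids = list(participant_ids)
    if len(ids) % len(conditions) != 0:
        raise ValueError("participants cannot be split evenly across conditions")
    rng = random.Random(seed)  # seeded generator so the assignment is reproducible
    rng.shuffle(ids)
    # Deal shuffled participants to the conditions in round-robin order.
    return {pid: conditions[i % len(conditions)] for i, pid in enumerate(ids)}


assignment = assign_conditions(range(1, 61), seed=2024)
group_sizes = {c: sum(1 for v in assignment.values() if v == c) for c in ("PHE", "PLE")}
```

With sixty participant identifiers, `group_sizes` holds thirty participants per condition regardless of the seed.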

All participants completed an informed consent form before the VR simulation. They all reported no physical or mental conditions that would prevent them from participating in a VR simulation. They were informed about the side effects of using VR, such as cybersickness, and they were advised to stop the simulation anytime they wanted. Participants were also asked about their past experience using VR, and for PLE, 6/30 participants had played VR for less than 60 minutes during the past one year, and for PHE, 8/30 participants had the same experience. All participants had regular sight or had corrective lenses and no color vision deficiencies. They were compensated with gift cards for their participation.

2.3 Measurements

Quantitative measurements are based on participants’ performance within the simulation, and the following data were collected:

  (1) The patterns of evacuation behavior were used to summarize different types of behavior among participants after an alarm is triggered;

  (2) Compliance behavior with the emergency alarm was used to examine evacuation efficiency after an alarm is triggered, and it was registered as a dichotomous response based on “evacuated” (participants evacuated from the building) or “not evacuated” (participants remained in the building);

  (3) Three open-ended questions were asked to understand the reasons/motivations behind behaviors: (a) What is the meaning of the sound? (b) What did you do after the sound? and (c) Why?

  (4) The Simulator Sickness Questionnaire (adapted from Kennedy et al. 1993), the Presence Questionnaire (adapted from Witmer and Singer 1998), and the Usability Questionnaire (adapted from Witmer and Singer 1998) were mainly used as control variables.

All subjective questions and scales used were administered in Chinese and later were single back-translated into English (Talbert et al. 2013).

2.4 Narrative

The sense of presence of human subjects is a fundamental factor that determines the usefulness of virtual environments and the ecological validity of VE-based studies (Zou et al. 2017; Schuemie et al. 2001). To achieve a sufficient sense of presence, the narrative context is essential (Gorini et al. 2011): it creates a credible scenario, which can strengthen participants’ illusion of being in a virtual environment while acting as if they were in the real world.

For this study, a narrative context was created in which participants attend a dream job interview at an international hi-tech company called “Godden Silver (GS).” The context described general information, such as core values, current employment situation, and global development strategy. It specifically introduced the GS working benefits for all professions, such as a nearby location, relatively high salary, complete holiday and insurance system, and even education funding for kids. This opportunity is considered lucrative for most employees worldwide, and fortunately, the company has scheduled a first-round simulation job interview with the participant. To involve participants in the simulation, they were told that the interview was competitive and that they needed to outperform other candidates to improve their chances of getting into the next round.

A pilot test was conducted with five volunteers to test the narrative, mainly considering its engagement, credibility, and easy understanding. They all reported that the narrative was easy to understand and found the content convincing and attractive.

2.5 Virtual environment (VE)

According to the narrative, a VE was developed based on requirements generated in a brainstorming session with six experts in ergonomics (1); architecture (1); design (2); psychology (1); and computer engineering (1). The building has two floors (ground floor: the main entrance, reception area, hall 1, and stairs to the first floor; first floor: the lobby, hall 2, and the meeting room). See details in Figs. 1 and 2.

Fig. 1 Floor plan. Letters are telepoints with log systems (explained in Sect. 2.7), and the dashed lines between them indicate the walking path

Fig. 2 Views of the reception area and meeting room

The VE was built with Unity (version: 2019.2.14f1) and was enriched with elements to create a more realistic scenario, such as colors, textures, lights, and objects. Direction signs to the meeting room and emergency exit signs were also set up. Plugins for automatic data collection (measurement of human behavior) designed by ergoUX—Ergonomics and User Experience Lab in the School of Architecture, University of Lisbon, were used.

2.6 Trial scenario

A different scenario (a trial VE) was used for training purposes (developed according to the primary interaction behaviors that will occur in experimental VE) to get participants acquainted with the VR setup, homogenize differences in performance of using teleporting, and bring their virtual movements closer to their realistic/natural actions. Each participant interacted with this trial scenario to practice navigation and content selection on a touch screen using VR hand controllers until being able to complete these operations without difficulties.

2.7 Interactions in the VE

Locomotion inside a VE is always challenging (see Zhao et al. 2020 for a review), and teleportation was used in the current study. This approach allows participants to explore/visit critical areas inside the environments with a natural head movement for observation and makes it easier for the researcher to track and analyze people’s movement decisions. Most importantly, it avoids the cybersickness caused by continuous movement, that is, steering locomotion (Christou and Aristidou 2017). However, it also comes with problems, such as eliminating the transitional optic flow, which has been proven to be a source of disorientation when people navigate in large-scale environments (Zhao et al. 2020). To avoid disorientation, the virtual environment used indoor open spaces segmented according to the telepoints, and the reception area had a single access to the next level (the stairs). Barriers were placed to direct participants to the meeting room (destination point). This reduced the complexity of the building design. Participants also had information about the task they had to perform during the simulation, as well as directional signage indicating the path they needed to follow.

Participants can jump from one telepoint to another using a controller-emitted laser to point & select the target telepoint. There is at least one telepoint in each of the seven sections inside the virtual building (see Fig. 1) for participants to reach and explore the areas, and several telepoints were placed in between them to ensure the path is connected equally in travel distance. A total of 18 telepoints were deployed in the virtual building (see the 18 white dots in Fig. 1), and participants had to follow them in sequence (cannot jump over more than one telepoint).

Eleven telepoints were selected as the critical points for analyzing participants’ movement decisions during the emergency, and they were programmed with a log system that registered the time at which each was reached, which was used for behavior analysis. The 11 telepoints are marked with letters (from A to K) and shown in Fig. 1. All participants started from the main entrance in the VE.
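As a rough illustration of the log system described above, the following sketch records when each lettered telepoint is reached. The actual plugin ran inside Unity and its interface is not described in the text, so the class `TelepointLog`, its methods, and the example timings are hypothetical.

```python
from dataclasses import dataclass, field

# The 11 logged telepoints are labeled A to K in Fig. 1; the remaining
# telepoints carry no log entry.
LOGGED_POINTS = set("ABCDEFGHIJK")


@dataclass
class TelepointLog:
    """Hypothetical record of (label, seconds since session start) events."""

    events: list = field(default_factory=list)

    def reach(self, label, t):
        # Only the critical telepoints are registered; others are ignored.
        if label in LOGGED_POINTS:
            self.events.append((label, t))

    def first_time(self, label):
        """Return the first time `label` was reached, or None if never reached."""
        for point, t in self.events:
            if point == label:
                return t
        return None

    def path(self):
        """Logged telepoints in the order they were visited."""
        return "".join(point for point, _ in self.events)


# Made-up example: "x" stands for an unlogged intermediate telepoint.
log = TelepointLog()
for label, t in [("A", 0.0), ("B", 12.5), ("x", 15.0), ("J", 48.0), ("B", 130.0)]:
    log.reach(label, t)
```

A visit sequence such as the one returned by `log.path()` is the kind of record from which movement decisions after the alarm could be read off.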

2.7.1 Interaction in the PHE condition

In the PHE experimental condition, when participants reached the reception area, they were informed by the receptionist that the interview would be held in a meeting room at the upper level of the building. They must interact with a standing machine with a touch screen to participate in the interview, a quiz session (see the first picture in Fig. 3).

Fig. 3 From left to right: interview machine, quiz interface, and sign board in the sofa area

Participants needed to go upstairs using the stairs and locate the meeting room by direction signs. After reaching the interview machine, participants received guidelines from the machine’s screen for the quiz sessions (40 questions). Guidelines inform the rules participants need to follow: (a) there is no feedback in any form from the machine, and (b) there is no time limit.

Forty multiple-choice general-knowledge questions were collected from the internet and chosen to be easy and fun. For example, the question “What's the biggest animal in the world?” has four answers to choose from: mouse, dog, elephant, or blue whale. Pretests with five workers were performed (same samples as the final experiment) to evaluate the engagement level of all questions. They all reported that the questions were interesting and easy to answer.

The operation on the machine was simple. After choosing an answer from four options, there would be a “Yes” button for confirmation to go to the next question and a “No” button for reselection (see the second picture in Fig. 3).

The alarm started 50 seconds after participants touched the first button on the screen. The entire PHE session ended when one of the following occurred: (i) 5 minutes after the alarm started; (ii) participant reached back to the receptionist (B point); (iii) participant reached back to the main entrance (A point); and (iv) participant finished all 40 questions.

2.7.2 Interaction in the PLE condition

In the PLE condition, the information participants received from the receptionist was the location where the interview would be held and the instruction “please wait in the sofa area inside the meeting room, for the interview to come soon.” Instead of an interaction machine set up at the J point, a signboard indicated the sofa area (see the last picture in Fig. 3).

The alarm started 50 seconds after the participant reached the sofa area (J point). The entire PLE session ended when one of the following occurred: (i) 5 minutes after the alarm started; (ii) participant reached back to the receptionist (B point); and (iii) participant reached back to the main entrance (A point).

This condition was considered a control condition, which provides a baseline for comparing of human behaviors toward the ISO-type evacuation alarm and helps to verify the influence of pre-emergency engagement on evacuation efficiency.
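The alarm timing and session-end rules for the two conditions can be summarized in a small sketch. The 50-second delay, the 5-minute limit, the 40 questions, and the A and B points come from the text; the function names, the argument layout, and the choice to ignore visits to A or B before the trigger event (participants pass both points on their way in) are assumptions.

```python
ALARM_DELAY_S = 50           # alarm starts 50 s after the trigger event
TIMEOUT_AFTER_ALARM_S = 300  # session ends 5 min after the alarm starts
QUIZ_LENGTH = 40             # number of quiz questions in the PHE condition


def alarm_time(trigger_time):
    """Trigger is the first screen touch (PHE) or reaching the sofa area (PLE)."""
    return trigger_time + ALARM_DELAY_S


def session_end_reason(condition, now, trigger_time, current_point, questions_answered=0):
    """Return why the session ends at time `now`, or None if it continues.

    Assumption: reaching A or B only counts as returning once the
    trigger event has happened.
    """
    if trigger_time is None:
        return None
    if current_point == "B":
        return "reached receptionist"
    if current_point == "A":
        return "reached main entrance"
    # Finishing the quiz only applies to the high-engagement condition.
    if condition == "PHE" and questions_answered >= QUIZ_LENGTH:
        return "quiz finished"
    if now - alarm_time(trigger_time) >= TIMEOUT_AFTER_ALARM_S:
        return "timeout"
    return None
```

For example, a PLE participant who reached the sofa at 40 s and is still at point J at 390 s has hit the timeout, because the alarm started at 90 s.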

2.8 Apparatus

Participants viewed the VE (first-person egocentric viewpoint) through an HTC VIVE Pro head-mounted display (HMD), and two HTC hand controllers were used as the navigation device (teleporting). The HMD was equipped with headphones, allowing participants to listen to instructions and alarm sounds. The simulation ran on a Microsoft Windows graphics workstation equipped with an NVIDIA GTX1060 graphics card, and an external monitor displayed the same image of the VE that was being displayed to the participants.

2.9 Procedure

All experiments took place in a Chinese company from where all participants were recruited. Upon arriving at the experiment room, participants received a brief explanation from the researcher about the experiment. Participants were asked to sign the consent form in which they agreed to participate in the study as a volunteer and to allow images to be recorded. Participants were unaware of the real objective of the experiment. They were told that the experiment aimed to evaluate new VR software, and they ought to fulfill specific tasks as accurately and naturally as possible. All participants were advised they could stop the simulation anytime they wanted.

The complete VR experiment consisted of a training stage (trial scenario) and an experimental session. After the trial session, participants were required to read the narrative text before the experimental session, and then they were assigned to one of the two manipulated conditions. Once the simulation started, no dialog between the participant and the researcher occurred. After the experimental session, participants were asked to fill out the demographic questionnaire, answer the open-ended questions, and complete the simulator sickness, presence, and usability questionnaires.

3 Results

Data were analyzed considering the study objectives and the defined hypothesis (i.e., the level of task engagement will influence human behavioral compliance with evacuation alarms during the pre-evacuation period).

Concerning the patterns of evacuation behavior results, after receiving information from the receptionist, participants went to the meeting room and engaged in the activity by waiting in the sofa area (PLE) or interacting with the quiz machine (PHE). Fifty seconds later, the alarm started, and a total of five types of evacuation behavior were verified in both conditions (see Table 1).

Table 1 Patterns of five evacuation behavior in two conditions

Most participants in the PLE condition showed behavior type C (53.3%): they went around in the environment before reaching the receptionist/entrance. In the PHE condition, however, most participants showed behavior type B (73.3%): they remained at the task/machine after the alarm sounded. The second-largest group in the PLE showed behavior type E (23.3%), going around until the end of the simulation, followed by behavior type A (13.3%; participants went directly to the receptionist/entrance), behavior type D (6.7%; participants went around in the environment before returning to the task), and behavior type B (3.3%; participants remained at the task). In the PHE, the second-largest group showed behavior type C (16.7%), followed by type D (6.7%) and type A (3.3%); no participant showed behavior type E (0%).

Concerning the compliance behavior results, behavior types A and C were considered “evacuated” (participants evacuated from the building), and behavior types B, D, and E were considered “not evacuated” (participants remained in the building). As shown in Table 2, among the 60 participants, a total of 26 participants (43.3%) evacuated the building, and 34 participants (56.7%) did not evacuate. Table 2 also shows the frequencies and percentages of compliance behaviors of each condition. In the PLE, 20 participants (66.7%) evacuated the building, and 10 participants (33.3%) did not evacuate. In the PHE, 6 participants (20%) evacuated the building, and 24 participants (80%) did not evacuate.
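The mapping from behavior types to the dichotomous compliance measure can be checked with a few lines of arithmetic. In the sketch below, the per-type counts are back-calculated from the reported percentages with n = 30 per condition, which is an assumption about how the percentages were derived; the helper names are illustrative.

```python
N_PER_CONDITION = 30

# Percentages of behavior types A to E as reported in the text.
REPORTED_PCT = {
    "PLE": {"A": 13.3, "B": 3.3, "C": 53.3, "D": 6.7, "E": 23.3},
    "PHE": {"A": 3.3, "B": 73.3, "C": 16.7, "D": 6.7, "E": 0.0},
}

# Types A and C ended at the receptionist/entrance and count as "evacuated".
EVACUATED_TYPES = {"A", "C"}


def counts_from_pct(pcts, n=N_PER_CONDITION):
    """Convert rounded percentages back to whole-participant counts."""
    return {k: round(p * n / 100) for k, p in pcts.items()}


def compliance(counts):
    """Return (evacuated, not_evacuated) for one condition."""
    evacuated = sum(c for k, c in counts.items() if k in EVACUATED_TYPES)
    return evacuated, sum(counts.values()) - evacuated


ple_counts = counts_from_pct(REPORTED_PCT["PLE"])
phe_counts = counts_from_pct(REPORTED_PCT["PHE"])
```

Under this assumption the recovered counts sum to 30 in each condition, and `compliance` reproduces the 20/10 (PLE) and 6/24 (PHE) split reported for the two conditions.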

Table 2 Results of compliance behavior in two conditions

To test whether participants’ compliance behavior is independent of the level of pre-emergency engagement, SPSS software was used to cross-tabulate experimental conditions (PHE/PLE) with compliance, and chi-square tests were performed to test whether compliance behavior differed depending on the level of engagement in the task (experimental condition). Results showed that the level of compliance was not independent of the level of engagement (χ² = 13.303, p < 0.001), and compliance in the PLE condition was significantly higher than in the PHE condition (the hypothesis was confirmed).
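The reported statistic can be reproduced from the 2 × 2 compliance frequencies (PLE: 20 evacuated, 10 not; PHE: 6 evacuated, 24 not). The sketch below computes the Pearson chi-square without continuity correction using only the standard library; the function name is illustrative, and the original analysis was run in SPSS rather than with this code.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi2, p), where p is the upper-tail probability for 1 degree
    of freedom, computed as erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Rows: PLE, PHE; columns: evacuated, not evacuated.
chi2, p = pearson_chi2_2x2(20, 10, 6, 24)
```

With these frequencies `chi2` rounds to 13.303 and `p` is below 0.001, which matches the uncorrected Pearson statistic.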

Concerning the results of open-ended questions, we found out that participants had different understandings about the alarm—categorized in levels—which further guided their evacuation behaviors:

  (1) Level I: participants who understood the evacuation alarm (e.g., had specific training/education/experience) evacuated directly after perceiving the evacuation signal (Kinateder et al. 2015; Wogalter 2006; Proulx and Laroche 2003).

  (2) Level II: participants who did not perceive the alarm as an evacuation signal due to cue ambiguity (Kuligowski and Mileti 2009) but related the signal to emergency/danger performed information-seeking behaviors (Kuligowski and Hoskins 2010; SFPE 2019; Rahouti et al. 2020), such as seeking additional information, investigating the situation, asking the receptionist for help, and even going back to continue the pre-evacuation task after no danger signs (e.g., fire) were found in the environment (Wogalter 2006; Kuligowski and Mileti 2009), while keeping an eye on their surroundings for new signs of danger.

  (3) Level III: participants who neither knew the evacuation alarm nor related the sound to emergency/danger performed safety-irrelevant behaviors after hearing the sound, and these differed between the two conditions:

    (i) In the PLE, participants often related the sound to simulation system warnings, such as wrong movements in the VE (e.g., not sitting on the sofa properly/in time and stepping into a forbidden area), wrong operations on hand controllers, or even simulation bugs on VR devices. As a result, they wasted much time trying irrelevant things in the VE (e.g., they tried to correct their movements, searched for the sound source, and teleported around with no purpose);

    (ii) In the PHE, participants often related the sound to the interview, such as a reminder of wrong answers in the quiz, the countdown of the quiz session, or even the sound of the machine breaking down. As a result, they concentrated even more on the task (answering questions).

See Table 3 for the frequencies of each level of knowledge background existing in different evacuation behaviors between two conditions.

Table 3 Frequencies of alarm background knowledge levels in different compliance behaviors

There were 5 level I participants in the PLE, and as expected, they all evacuated the building (Kinateder et al. 2015). However, 2 of them did not evacuate directly (behavior C: went around before evacuation), reporting that they got lost on their way out because the escape route was unfamiliar (Averill et al. 2019; Andrée et al. 2016). There were 15 level II participants in the PLE, and 13 of them evacuated. However, 12 of the 13 performed behavior C and reported that they went to the receptionist to ask about the interview that had not yet taken place, not to evacuate. The 2 with behavior D both reported that they continued the pre-evacuation task after no danger signs were found. There were 10 level III participants in the PLE, and as expected, 8 of them did not evacuate (behavior B: 1; behavior E: 7, who performed safety-irrelevant behaviors in the VE). The 2 who evacuated (behavior C) also reported that they reached the receptionist not to evacuate but while searching for the sound source.

In the PHE, there were 3 level I participants, who were expected to evacuate directly. However, 2 of them remained at the task (behavior B) due to high engagement in the pre-evacuation activity (Averill et al. 2013; Chandler and Sweller 1991), and both reported that they had neglected the alarm by mistake. There were 14 level II participants in the PHE. Unlike in the PLE, where most level II participants evacuated (13/15), only 5 of the 14 had behavior C, and the remaining 9 did not evacuate due to the high-engagement activity (behavior B: 7; behavior D: 2). There were 13 level III participants in the PHE, and as expected, all of them remained at the task (behavior B).

Concerning the simulator sickness, presence, and usability Questionnaire results, we mainly used these questionnaires as control variables to ensure similarity between conditions because researchers have indicated that the current VR emergency applications do not cause too much simulator sickness, and people experience a reasonable degree of presence and usability (Bourhim and Cherkaoui 2020; Vilar et al. 2014). For the simulator sickness, 23 symptoms (adapted from Kennedy et al. 1993) were used to assess simulator sickness after simulation. Participants’ subjective severity rantings were based on a scale from 0 (no perception) to 3 (severe perception). A factor analysis categorized all symptoms into three non-mutually exclusive categories: Nausea, Oculomotor disturbance, and Disorientation, and weights were assigned to each category and summed together into a single total score (TS) which provides a description of overall simulator sickness scores for a given simulation. The TS between conditions is close, with the mean and standard deviation scores of 6.9 (4.5) and 6.3 (3.2), respectively, which was in the minimal symptom level range (5 to 10), indicating very slight simulator sickness symptoms (Stanney et al. 1997). For the presence results, five aspects of the presence level were measured (Immersion, Control, Sensory, Distracting, and Realism). The responses were based on a 7-point visual analog scale, with “1” indicating disagree, “4” indicating neutral, and “7” indicating agree, to calculate the average score and standard deviation. The overall presence scores between conditions are close and are more than neutral  −  4.7 (1.8) and 4.9 (1.9), meaning a qualified level of the presence (Witmer and Singer 1998). For the usability results, key aspects were measured (narrative context and VE coherence, interaction and visual quality, simulation duration, body movement, and head orientation speed). 
Responses were given on the same 7-point scale, and the overall usability scores were also adequate and similar between conditions: 4.5 (1.4) and 4.7 (1.5).
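The weighting step for the simulator sickness total score can be sketched as follows. This is a minimal illustration that assumes the standard scaling weights of the Simulator Sickness Questionnaire (Kennedy et al. 1993); the weights and the item-to-category mapping of the adapted 23-symptom version used in this study are not listed here, so the function takes the raw category sums as inputs, and all names in the code are hypothetical.

```python
from dataclasses import dataclass

# Standard SSQ scaling weights (Kennedy et al. 1993). Assumed for this
# sketch: the weights of the adapted 23-symptom version are not given.
NAUSEA_WEIGHT = 9.54
OCULOMOTOR_WEIGHT = 7.58
DISORIENTATION_WEIGHT = 13.92
TOTAL_WEIGHT = 3.74


@dataclass
class SSQScores:
    nausea: float
    oculomotor: float
    disorientation: float
    total: float


def ssq_scores(nausea_raw, oculomotor_raw, disorientation_raw):
    """Scale raw category sums (each item rated 0 to 3) into SSQ scores.

    In the standard SSQ, the total score is the sum of the three raw
    category sums multiplied by a single total-score weight.
    """
    raw_sum = nausea_raw + oculomotor_raw + disorientation_raw
    return SSQScores(
        nausea=nausea_raw * NAUSEA_WEIGHT,
        oculomotor=oculomotor_raw * OCULOMOTOR_WEIGHT,
        disorientation=disorientation_raw * DISORIENTATION_WEIGHT,
        total=raw_sum * TOTAL_WEIGHT,
    )


# Hypothetical participant: raw sums of 1 (Nausea), 1 (Oculomotor), 0 (Disorientation).
example = ssq_scores(1, 1, 0)
print(example)
```

Under these assumed weights, the hypothetical participant above has a total score of 2 × 3.74 = 7.48, which would fall in the minimal symptom range (5 to 10) mentioned in the text.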

4 Discussion

The main objective of the present study was to verify that a VR-based methodology is a useful way of studying human behavior during the pre-evacuation period of building emergencies. To this end, a virtual emergency evacuation experiment was conducted that considered different levels of engagement in pre-evacuation activities (i.e., low and high). Its results were compared with those reported in the literature, which were usually obtained in real-world environments.

Firstly, the five defined types of evacuation behavior (see the results in Patterns of Evacuation Behavior) are consistent with previous studies (Lin et al. 2020; Gershon et al. 2012). Furthermore, in PLE, most participants walked around the environment before evacuating (behavior type C, 53.3%). Combined with the answers to the open-ended questions, this behavior is consistent with prior studies, which indicated that people search for additional information to validate a cue when they do not recognize the sound as an evacuation signal and, because of its ambiguity, do not perceive enough urgency (Wogalter 2006; Rahouti et al. 2020). In PHE, most participants (behavior type B, 73.3%) remained at the task, which is also consistent with a prior study (Kuligowski and Hoskins 2010) in which 65% of the people who had experienced emergencies maintained their pre-emergency activities. Moreover, some PHE participants misinterpreted the alarm as a sign of a wrong answer or of the machine breaking down, which made them concentrate even more on answering the machine's questions. This behavior shows how the pre-evacuation activity delays people's emergency response. People who have tasks to do and have invested a significant amount of effort in their activity are more attached to those tasks and reluctant to disengage from them in response to an ambiguous signal (Gershon et al. 2012; Lin et al. 2020; Liu and Lo 2011). In contrast, in the PLE condition, where participants had relatively little to do (low-engagement situation), they responded faster and performed information-seeking behaviors when facing ambiguous signals (Averill et al. 2013; Lindell and Perry 2012). As a result, they evacuated (walked around and reached the receptionist/main entrance) not because they understood the T-3 signal but because they were searching for additional information in response to the ambiguous cue. 
Last but not least, participants may have confused the sound generated in front of them (the interview machine) with sound from the surrounding environment (e.g., the fire alarm). We encountered this problem in the early phase of environment design, mainly when the stereo sound system had not yet been applied. We therefore carefully placed the fire alarm device (with a stereo sound system) in room Hall_2, next to the Meeting Room where the machine interview took place (see Fig. 1), to prevent the fire alarm sound from being confused with feedback from the machine. As a result, participants were able to locate the alarm device by listening to the stereo-directional sound in the pre-test phase. However, some may still have been confused about the sound, as they reported that they did not know what to do but related the sound to the ongoing activity (the machine interview), which is also consistent with our hypothesis.

Secondly, the low evacuation rate for the ISO-type evacuation alarm obtained in the compliance behavior results was expected: Proulx and Laroche (2003) concluded that it is not realistic to expect occupants to start evacuating upon perceiving such a signal, because they can rarely identify the regular ISO-type alarm as an evacuation signal. Furthermore, the compliance behavior results reflected the influence of pre-evacuation activity on people’s evacuation behaviors (Kuligowski and Hoskins 2010), as the evacuation efficiency of PLE was higher than that of PHE. More importantly, the statistical result confirmed the significant influence of pre-evacuation engagement on behavioral compliance, confirming the hypothesis. Thus, the adequacy of the VR-based methodology is confirmed, because the experimental setup detected the differences in engagement between the manipulated conditions and produced evacuation behaviors similar to those in real situations.

Thirdly, the three levels of background knowledge derived from the answers to the open-ended questions further explained the motivations and reasons behind the different evacuation behaviors. This is also consistent with a previous study showing that personal experience with the alarm is one of the critical factors affecting participants’ compliance behavior (Kinateder et al. 2015).

Lastly, the simulator sickness, presence, and usability questionnaires mainly served as control variables to check the similarity between conditions. All results were similar between conditions and adequate; thus, these variables were controlled successfully.

Building on the solid grounding of VR-based methodology in studying human behavior during the movement period of an emergency evacuation (Kinateder and Warren 2016; Vilar et al. 2014; Duarte et al. 2014), this study's novel contribution is to explore the influence of pre-emergency engagement on evacuation behaviors during the pre-evacuation period (Rahouti et al. 2020; Kuligowski and Hoskins 2010). In summary, our VR-based evacuation study successfully reproduced the inefficiency of the ISO-type evacuation alarm, the negative influence of pre-emergency engagement on evacuation efficiency, and different evacuation behaviors similar to those reported in the literature (Gershon et al. 2012; SFPE 2019; Zhao et al. 2009; Averill et al. 2019), and it also highlighted the need to develop the next generation of evacuation alarms. The study confirmed the effectiveness of the presented methodology, contributing to a deeper understanding of human–media interaction that will eventually improve human safety in buildings.

5 Limitations

We understand that the best way to verify the effectiveness of VR as a means of studying human behavior during the pre-evacuation period (a dangerous situation) is to replicate a real-life experimental scene and compare the simulation data with the results of the real experiment (Arias et al. 2019; Bourhim and Cherkaoui 2020). However, this would still put people at risk. In our study, in which we measured participants' emergency responses to the ISO-type evacuation alarm in a critical situation, we could only verify that participants' behavior was consistent with what has been reported in the literature for critical situations (e.g., fires and terrorist attacks), specifically the phenomenon that people engaged in a task do not immediately decide to evacuate after hearing the alarm. Furthermore, our virtual setup can serve as an essential reference for exploring and testing next-generation alarm proposals in future studies.

Compared to real situations, our experimental settings were relatively simple. For example: (1) the virtual setting did not allow participants to perform detailed information-seeking and preparation behaviors during the pre-evacuation period, such as communicating with other occupants, collecting personal belongings, and getting dressed (Lin et al. 2020); (2) the current VR setup involved only visual and audible stimuli, whereas in a real situation smell, tactile, and haptic information should also be considered to produce more immersive experiences and improve ecological validity (SFPE 2019); and (3) no interaction with the virtual receptionist was possible after participants returned to it, so their behavioral intentions could only be verified through interviews. However, it would not otherwise be possible to study pre-evacuation human behavior in such depth without VR technology. Future improvements to the VE settings are needed.

We did not consider the effect of “social influence” on behavior compliance (Kinateder and Warren 2016). In building environments, occupants are not alone, and they have connections/bonds with others for different reasons (SFPE 2019). Building occupants will try to fit social norms and behave as other people expect, which influences their reactions and responses to the alarm when an emergency occurs (Lovreglio et al. 2014). In future studies, other building occupants should be added to the VE, and the influence of different roles, such as manager or customer, should also be considered (Nilsson and Johansson 2009).

In the near future, many of the issues and directions for future work mentioned above might be addressed, some by contributing new methods and others by applying new approaches to investigate human compliance behaviors. This work is intended to answer one small question among the many that remain.